Xiaoliang Dai

Non-Markov Multi-Round Conversational Image Generation with History-Conditioned MLLMs

Jan 28, 2026
Via arXiv

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation

Dec 31, 2025
Via arXiv

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Apr 24, 2025
Via arXiv

MoCha: Towards Movie-Grade Talking Character Synthesis

Mar 30, 2025
Via arXiv

DirectorLLM for Human-Centric Video Generation

Dec 19, 2024
Via arXiv

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

Dec 13, 2024
Via arXiv

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Dec 03, 2024
Via arXiv

Towards Automated Model Design on Recommender Systems

Nov 12, 2024
Via arXiv

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024
Via arXiv